Chinese Named Entity Recognition and Disambiguation Based on Wikipedia
Authors
Abstract
This paper presents a method for named entity recognition and disambiguation based on Wikipedia. First, we build a Wikipedia database using JWPL, an open-source toolkit. Second, we extract the definition term from the first sentence of each Wikipedia page and use it as external knowledge in named entity recognition. Finally, we perform named entity disambiguation using Wikipedia disambiguation pages and contextual information. The experiments show that Wikipedia features can improve the accuracy of named entity recognition.
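The disambiguation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the candidate senses of a mention have already been collected (for example from a Wikipedia disambiguation page) and that both the mention context and each candidate description are already tokenized. The function names `cosine` and `disambiguate`, the bag-of-words cosine scoring, and the English toy data are all illustrative assumptions.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context_tokens, candidates):
    """Return the candidate whose description best matches the context.

    context_tokens: tokens surrounding the mention (assumed pre-segmented).
    candidates: dict mapping a candidate entity name to the tokens of its
        description (hypothetical stand-in for text taken from the pages
        listed on a disambiguation page).
    Returns None when there are no candidates.
    """
    if not candidates:
        return None
    ctx = Counter(context_tokens)
    return max(candidates, key=lambda name: cosine(ctx, Counter(candidates[name])))


# Toy data, invented for illustration only.
candidates = {
    "Apple Inc.": ["technology", "company", "iphone", "computer"],
    "Apple (fruit)": ["fruit", "tree", "edible", "orchard"],
}
context = ["new", "iphone", "from", "the", "company"]
best = disambiguate(context, candidates)
```

Here `best` is the candidate sharing the most vocabulary with the context. For Chinese text, a word segmenter would have to produce the token lists first; that step is outside this sketch.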
Similar resources
The Task 2 of CIPS-SIGHAN 2012 Named Entity Recognition and Disambiguation in Chinese Bakeoff
The CIPS-SIGHAN 2012 Chinese Named Entity Recognition and Disambiguation (NERD) bake-off was held in the summer of 2012. Named entity recognition and disambiguation is an important task in natural language processing and knowledge base construction. It aims at detecting entity mentions in raw text, followed by pointing the detected mentions to real world entities. Often, real world entities can...
Improving Persian Named Entity Recognition Using Kasre Ezafe
Named entity recognition is a process in which people's names, place names (cities, countries, seas, etc.), organizations (public and private companies, international institutions, etc.), dates, currencies, and percentages in a text are identified. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...
Large-Scale Named Entity Disambiguation Based on Wikipedia Data
This paper presents a large-scale system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection and Web search results. It describes in detail the disambiguation paradigm employed and the information extraction process from Wikipedia. Through a process of maximizing the agreement between the contextual information ex...
Exploiting WordNet for Wikipedia-Based Named Entity Disambiguation
Entity disambiguation is an important problem in semantic analysis and natural language processing. In this paper, we propose an approach to employ features of the WordNet ontology in the task of disambiguating named entities to Wikipedia. Methods of enriching text with synonymous relations of words are explored. An analysis of the results from our experiments shows that the accuracy of the dis...
Named Entity Linking Based On Wikipedia
In this paper, we present the ideas and methodologies on labeling the mentioned entities with the wiki dataset. This paper presents a system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection from Wikipedia. We focus on maximizing the similarity between the contextual information extracted from Wikipedia and the ...